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Abstract: 

We consider the problem of obtaining locally D-optimal designs for fac- 
torial experiments with binary response and k qualitative factors at two 
levels each. Yang, Mandal and Majumdar (2011) considered this problem 
for 2^ factorial experiments. In this paper, we generalize the results for 
2^ designs and explore in new directions. We obtain a characterization 
for a design to be locally D-optimal. Based on this characterization, we 
develop efficient numerical techniques to search for locally D-optimal de- 
signs. We also investigate the properties of fractional factorial designs 
and study the robustness, with respect to the initial parameter values of 
locally D-optimal designs. Using prior distribution on the parameters, 
we investigate EW D-optimal designs, that are designs which maximize 
the determinant of the expected information matrix. It turns out that 
these designs are much easier to find and still highly efficient compared to 
Bayesian D-optimal designs, as well as quite robust. 

Key words and phrases: Generalized linear model, full factorial design, 
fractional factorial design, D-optimality, uniform design, EW D-optimal 
design 



1 Introduction 

The goal of many scientific and industrial experiments is to investigate the effect of 
several factors on a response that is binary. Our objective is to explore optimal and 
efficient designs for these experiments. In Yang, Mandal and Majumdar (2011) we 
considered this problem for the 2^ factorial experiment. In this paper, we consider 
the general 2'^ case, that is, experiments with k factors at two levels each. The factors 
are either qualitative, or quantitative with only two levels possible. In other words. 
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the experimenter lacks the option of adjusting the levels of each factor beyond the 
two fixed ones. 

Instances of experiments of the sort we are interested in abound in different areas 
of application. Miller and Sitter (2001) described an experiment with 12 runs that 
investigated ways of reducing the level of toxic contaminant (stripper) from the waste 
stream of a chemical operation. The experiment considered nine factors at two levels 
each. Grimshaw et al. (2001) described an experiment on the vision-based targeting 
component of an automatic car-refueling system that was performed at Brigham 
Young University in 1998. The objective of the experiment was to find out if the 
robotic system could consistently insert the nozzle at the end of a telescoping arm 
into a gas inlet pipe or not. This experiment had ten factors at two levels each. Seo, 
Goldschmidt-Clermont and West (2007) discussed an experiment where the response 
could be whether or not a particular symptom is developed in mice. This study 
investigated three key environmental factors, age, gender and dietary fat intake along 
with a key genetic factor related to the ApoE (Apolipoprotein E) gene pathway. 
Hamada and Nelder (1997) discussed a 2^^^ fractional factorial experiment performed 
at IIT Thompson laboratory that was originally reported by Martin, Parker and 
Zenick (1987). This was a windshield molding slugging experiment where the outcome 
was whether the molding was good or not. There were four factors each at two 
levels: (A) poly-film thickness (0.0025, 0.00175), (B) oil mixture ratio (1:20, 1:10), 
(C) material of gloves (cotton, nylon), and (D) the condition of metal blanks (dry 
underside, oily underside). Vasandani and Govindaraj (1995) considered knowledge 
organization in intelligent tutoring systems where the objective was to determine 
whether a system fails or not. Street and Burgess (2007) reported the study conducted 
by Severin (2000) where she investigated six attributes (pizza type, type of crust, 
ingredients, size, price and delivery time) each at two levels to examine whether they 
make take-out pizza outlets more attractive or not. Nair et al. (2008) reported a 
2^~^ fractional factorial experiment used in the study called "Project Quit" , a Center 
for Health Communications Research project on smoking cessation. They studied six 
factors each at two levels and the primary outcome measure was abstinence over a 
7-day period assessed by the question, "Did you smoke a tobacco cigarette in the past 
7 days?" 

In many real-life applications, interest lies in accessing a situation with a binary 
outcome {e.g., in drug discovery, whether a compound is toxic or not, or in car 
manufacturing, whether a sensor works or not) but the practitioners often use a 
surrogate continuous response in order to use the optimal designs obtained under 
linear models. 

We assume that the process under study is adequately described by a generalized 
linear model (GLM). GLMs have been widely used for modeling binary response. 
Stufken and Yang (2012) noted that "the study of optimal designs for experiments 
that plan to use a GLM is however not nearly as well developed (see also Khuri, 
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Mukherjee, Sinha and Ghosh, 2006), and tends to be much more difficult than the 
corresponding and better studied problem for the special case of linear models." 

The optimal designs are obtained using the D-optimality criterion that maximizes 
the determinant of the information matrix. In order to overcome the difficulty posed 
by the dependence of the design optimality criterion on the unknown parameters, we 
primarily use the local optimality approach of Chernoff (1953) in which the parameters 
are replaced by assumed values. We refer the reader to the paper by Khuri, Mukherjee, 
Sinha, and Ghosh (2006) for details of the theory of designs under generalized linear 
models. 

We obtain theoretical results and develop algorithms for obtaining locally optimal 
designs. Experiments with factors at two levels each are often used in screening ex- 
periments, where the objective is to determine which factors among a large set are 
effective. Although we explore designs for "full factorials", i.e., ones in which obser- 
vations are taken at every possible level combination, when the number of factors is 
large, full factorials are practically infeasible. Hence the study of "fractional facto- 
rial" designs occupies a substantial part of the linear-model based design hterature, 
and we too study these designs in our generalized linear model setup. 

A natural question that arises when we use local optimality is whether the optimal 
designs are robust. The study of robustness with respect to the initial parameter 
values of locally D-optimal designs, therefore, is a significant part of our focus. 

For situations where reliable local values of the parameters are difficult to obtain but 
the experimenter may be able to specify a prior distribution, we suggest EW optimal 
designs, where the information matrix is replaced by its expectation under the prior. 
This is one of the suggested alternatives to formal Bayes optimality in Atkinson, 
Donev, and Tobias (2007). It has been used by Zayats and Steinberg (2010) for 
optimal designs for detection capability of networks. Effectively this reduces to a 
locally optimal design with local values of the "weight" parameters replaced by their 
expectations. It turns out that the EW optimal designs have strong efficiency and 
robustness characteristics as local designs, and in addition, they are very good and 
easy-to-compute surrogates for Bayes optimal designs. 

Beyond theoretical results, the question that may be asked is whether these results 
give the user any advantage in real experiments. Would the experimenter gain much 
by using these results instead of designs that are optimal for the linear model, for 
instance, uniform complete designs or regular fractions. After all, on 2^ experiments, 
Yang, Mandal and Majumdar (2011) concluded that the uniform design (i.e., full 
factorial) had many desirable properties. In the more complex and more realistic 
setup of 2^ experiments considered in the present work it turns out that the answer 
to the question posed above is a definite "yes" . In most situations we gain considerably 
by taking advantage of the results of this paper instead of using standard linear-model 
results. We briefly outline two conclusions that may be derived from our work. 
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First, unlike the linear model case, all regular fractions are not equivalent. Indeed, 
if we have some knowledge of the prior parameters, we will be able to use this to 
identify a fractional factorial design that may not be a regular fraction. Second, even 
though in the absence of any knowledge of the parameters, an uniform design may 
be the best, even with a little prior information, we will be able to choose an EW 
optimal design that is more robust with respect to the initial parameter values of 
locally D-optimal designs, in terms of minimizing the maximum relative loss. 

As mentioned in Yang, Mandal and Majumdar (2011), the paper by GraBhoff and 
Schwabe (2008) has some relevant results for the k = 2 factor case. For the gen- 
eral case, Gonzalez-Davila, Dorta-Guerra and Ginebra (2007), and Dorta-Guerra, 
Gonzalez-Davila and Ginebra (2008) obtained some results for 2'' experiments with 
binary response. They obtained an expression for the D-criterion and studied several 
special cases. 

This paper is organized as follows. In Section 2 we describe the preliminary setup. 
In Section 3 we provide several results for the locally D-optimal designs including 
the uniqueness of the D-optimal designs, characterization for a design to be locally 
D-optimal, the concept of EW D-optimal designs, and algorithms for searching D- 
optimal designs. In Section 4 we discuss the properties of fractional factorial designs. 
We address the robustness of D-optimal designs in Section 5 and conclude with some 
examples and remarks in Section 6. Proofs and some details about the algorithms 
are relegated to the Appendix and the Supplementary Materials of this paper. 

2 Preliminary Setup 

Consider a 2^^ experiment with binary response, i.e., an experiment with k explanatory 
variables at 2 levels each. Suppose rii units are allocated to the ith experimental 
condition such that Ui ^ 0, i = 1, . . . ,2^, and ni + ■ ■ ■ + = n. We suppose that 
n is fixed and the problem is to determine the "optimal" n^'s. In fact, we write our 
optimality criterion in terms of the proportions: 

Pi = rii/n, i = 1, . . . ,2^ 

and determine the "optimal" p^'s over real values. Since rij's are integers, an optimal 
design obtained in this fashion may not always be "feasible" . In Section 13.3.21 we will 
consider the design problem over integer n^'s. 

Suppose rj is the linear predictor that involves main effects and interactions which 
are assumed to be in the model. For instance, for a 2^ experiment with a model that 
includes the main effects and the two-factor interaction of factors 1 and 2, r/ = /3o + 
(3iXi+ /32X2 + (33X3 + f3i2XiX2, where each Xi G { — 1,1}. The aim of the experiment is to 
obtain inferences about the parameter vector of factor effects (3 = {(3q, (32, (3^, /3i2)'- 
In the framework of generalized linear models, the expectation of the response Y, 
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E (Y) = 7T, is connected to the linear predictor 77 by the hnk function g: rj = g (it) 
(McCullagh and Nelder (1989)). For a binary response, the commonly used link 
functions include logit, probit, log-log, and complimentary log- log links. 



The maximum likelihood estimator of (3 has an asymptotic covariance matrix (Mc- 
Cullagh and Nelder, 1989; Khuri, Mukherjee, Sinha and Ghosh, 2006) that is the 

inverse of nX'WX, where W = diag {wipi, W2kP2k} , uij = ( 4^ ) / (7rj(l — tTj)) > 



= (Sr. 

0, Tji and TTj correspond to the ith experimental condition for 77 and vr, and X is 
the "design matrix". For example, for a full factorial 2"^ 

?7 = /3o + PiXi + 132X2 + PzX2, + /3i2a;ia;2, 
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Note that the matrix X'WX may be viewed as the per-observation information ma- 
trix. A D-optimal design maximizing depends on the ifj's, which depend 
on the regression parameters (3 and the link function g. In this paper, we discuss 
D-optimal designs in terms of Wj's so that our results can be applied to all link 
functions. 



3 Locally D-Optimal Designs 

In this section, we consider locally D-optimal designs for the full factorial 2'^ ex- 
periment. The goal is to find an optimal p = {pi,p2, . . . ,P2k)' which maximizes 
/(p) = for specified values of > 0, i = 1, . . . , 2''. Here pi > 0, i = 1, . . . ,2^ 

and z^i^iPi = 1. It is easy to see that there always exists a D-optimal allocation p 
since the set of all feasible allocations is bounded and closed. On the other hand, the 
uniqueness of D-optimal designs is usually not guaranteed (see Remark [T]). 



3.1 Characterization of locally D-optimal designs 



Suppose the parameters (main effects and interactions) are f3 = (/3o, • • • , /3d)', 
where d > k. The following lemma expresses our objective function as an order- 
{d+1) homogeneous polynomial of pi, . . . ,p2k, which is useful in deriving optimal 
Pi's. 
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Lemma 3.1 Let X[ii,i2, . . . ,id+i] be the {d + 1) x {d + 1) sub-matrix consisting of 
the iith, i2th, . . ., id+ith rows of the design matrix X. Then 

f{v) = \X'WX\= ^ \X[ii,i2,...,id+i]? -p^Wi^Pi^Wi^-'-pi^^^^Wi^^^. 

I<ii<i2<---<«d+i<2'= 



Gonzalez-Davila, Dorta-Guerra and Ginebra (2007, Proposition 2.1) obtained essen- 
tially the same result. This can also be proved directly using the results from Rao 
(1973, Chapter 1). From Lemma [3. II it is immediate that at least {d + 1) Wj's, as well 
as the corresponding pi's, have to be positive for the determinant /(p) to be nonzero. 
This implies if p is D-optimal, then pi < 1 for each i. Theorem 13.11 below gives a 
sharper bound, pi < for each z = 1, . . . , 2^, for the optimal allocation. To check 
this we define for each i = 1, . . . ,2^, 

f I — z I — z 1 — z 1 — z \ 

Mz) = f[-, Pi, Pi-l,Z,- Pi+i,...,- P2k], 0<2<1. (2) 

V 1 - Pi I- Pi I- Pi I- Pi J 

Note that fi{z) is well defined for all p of interest. The next theorem characterizes 
all locally D-optimal allocations. 



Theorem 3.1 Suppose / (p) > 0. Then p is D-optimal if and only if for each 
i = 1, . . . ,2^^ , one of the two conditions below is satisfied: 



(i) p, = Oandf,{l) < ^/(p); 
(n) 0<p,<^^and /,(0) = '^^^f{p). 



A design that is minimally supported on {d + 1) points is called a saturated design. 
These designs are attractive because they require a minimum number of experimental 
settings. In our context, a design p = (pi, . . . ,^2'=)' with a 2'^ x (d + 1) design 
matrix X is called saturated if it contains exactly {d + 1) nonzero pis. Based on 
Lemma \3.1\ a saturated design p with pi^ > 0, . . . ,Pi^_^_^ > is D-optimal if and only 
if Pi^ = ■ ■ ■ = Pi^^j = Yang, Mandal and Majumder (2011) found a necessary 
and sufficient condition for a saturated design to be D-optimal for 2^ main-effects 
model. With the aid of Theorem 13.11 we provide a generalization for 2^ designs as 
follows. Note that Wi > for each i for the commonly used link functions including 
logit, probit, and (complementary) log-log. 



Theorem 3.2 Assume Wi > 0, i = 1, . . . ,2^ . Let I = {ii, . . . , ia+i} C {1, . . . , 2^} 

be an index set satisfying \X[ii, . . . , id+i] \ 7^ 0. Then the saturated design satisfying 
Pii = Pi2 — ' ' ' — Pid+i ~ dTi D-optimal if and only if for each i ^ I, 
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For example, under the 2^ main-effects model, pi = p2 = P3 = 1/3 is D-optimal 
if and only if t^i + f 2 + f 3 < v^, where Vi = 1/wi, i = 1,2,3,4. For the 2^ main- 
effects model, pi = p4 = = P7 = 1/4 is D-optimal if and only if Vi + + vq + 
Vj < 4min{f2, t'a, f5, fs}, and p2 = Ps = = Ps = 1/4 is D-optimal if and only if 
W2 + ^3 + ^^5 + ^^8 < 4min{wi,?;4,W6,^'7}- 

Remark 1 In order to characterize the uniqueness of the optimal allocation, we 
define a matrix = [1, w* 1, w*72, . . . , w*7s], where 1, 72, ... , 7^ are all distinct 
pairwise Schur products of the columns of the design matrix X, w = {wi, . . . , ^2*)', 
and "*" indicates Schur product. It can be seen that any two feasible allocations 
generate the same matrix X'WX as long as their difference belongs to the null space 
of Xyj. If rank(X^) < 2^^, any criterion based on X'WX yields not just a single 
solution but an afiine set of solutions with dimension 2^ — Ta.nk{X^). If rank(Xu;) 
= 2'^, the D-optimal allocation p is unique. For example, for a 2^ design the model 
consisting of all main effects and one two-factor interaction, or for a 2^ design the 
model consisting of all main effects, all two-factor interactions, and one three-factor 
interaction, will lead to a unique D-optimal allocation. 

3.2 EW D-optimal designs 

In order to use the D-optimal designs, we need to specify initial values for the /3j's, 
which gives us the Wj's. In Section O we will examine the robustness of the D- 
optimal design to mis-specification of /3j's. In situations where reliable initial values 
are unavailable, if a prior distribution may be specified for the parameters, one may 
use Bayes optimal designs. An alternative to Bayes optimality is to replace the 
information matrix X'WX in the criterion by its expectation E{X'WX). This is one 
of several alternatives suggested by Atkinson, Donev and Tobias (2007). We call this 
EW optimality. 

Definition: An EW D-optimal design is an optimal allocation of p that maximizes 
\X'E{W)X\. 

One interpretation of EW optimality is that it is local optimality with Wi^s replaced 
by their expectation. In this section we study the properties of EW-optimality. 

A Bayes D-optimal design maximizes -E(log where the expectation is taken 

over the joint prior distribution of /3j's. Note that, by Jensen's inequality, 

E (log \X'WX\) < log \X'E{W)X\ 

since log |X'iyX| is concave in w. Thus an EW D-optimal design maximizes an upper 
bound to the Bayesian D-optimality criterion. 
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We will show that EW-optimal designs are often almost as efficient as Bayes-optimal 
ones with respect to the Bayes optimality criterion, while realizing considerable sav- 
ings in computation time. Furthermore, EW-optimal designs are highly robust in 
terms of maximum relative efficiencies (see Section |3]for details). Note that given link 

function g, Wi = u{'qi) = v (x//3), 2 = 1, . . . , 2'', where v = {{g"'^)') / [g"^{l - g~^)], 
Xj is the ith row of the design matrix, and f3 = (/3o, . . . , /S^)'. 

Suppose the regression coefficients, /3o, . . . , /3d are independent, and /3i , . . . , /3(i all 
have a symmetric distribution about (not necessarily the same distribution), then it 
can be shown that the uniform design with pi = ■ ■ ■ = = is an EW D-optimal 
design for any given link function. While it is an interesting theoretical result that 
characterizes the EW D-optimal designs in terms of the regression coefficients, in most 
situations the experimenter will be able to specify the sign of the /3i's and possibly 
the range. If Pi G [0,/3jtt] for each i, the uniform design will not be EW D-optimal in 
general, as illustrated in the following examples. 



Example 3.1 Consider 2^ experiment with main-effects model. Suppose the exper- 
imenter has the following prior information for the parameters: Poi Pi^ f^2 are inde- 
pendent, /3o ~ 1, 1], and (32 ~ U[0, 1]. Under the logit link, E{wi) = E{wi) = 
0.187, E{w2) = E{w3) = 0.224, and the EW D-optimal design is Pe = (0.239, 0.261, 
0.261, 0.239)'. The Bayes D-optimal design, which maximizes 0(p) = -E'(log |XWX|), 
is Po = (0.235, 0.265, 0.265, 0.235)'. The relative efficiency of Pe with respect to Po is 

r d)(pe) - 0(Po) 1 r -4.80665 - (-4.80642) 1 
exp I ^ ^ ^ > X 100% = exp I ^ ^ x 100% = 99.99%. 

The computational time cost for EW is 0.11 sec, while it is 5.45 sees for maximiz- 
ing 0(p). It should also be noted that the relative efficiency of the uniform design 
p„ = (1/4, 1/4, 1/4, 1/4)' with respect to p^ is 99.88% for logit link, and is 89.6% for 
complementary log-log link (EW design's is 100.00%, see Remark |3]). 



Example 3.2 Consider 2^ experiment with main-effects model. Suppose /3o, Pi, P2, Ps 
are independent, Pq ~ 3,3], and Pi,P2,P3 ~ t/[0,3]. Then E{wi) = E{ws) = 
0.042, E{w2) = E{w3) = ■■■ = Eiwj) = 0.119. Under the logit link the EW D- 
optimal design is given by pe = (0, 1/6, 1/6, 1/6, 1/6, 1/6, 1/6, 0)', and the Bayesian 
D-optimal design is Po = (0.004,0.165,0.166,0.165,0.165,0.166,0.165,0.004)'. The 
relative efficiency of pe with respect to po is 99.98%, while the relative efficiency of 
the uniform design is 94.39%. It is remarkable that it takes 2.39 seconds to find an 
EW solution while it takes 121.73 seconds to find Po. The difference in computational 
time is even more prominent for 2^ case (24 seconds versus 3147 seconds). 
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3.3 Algorithms to search for locally D-optimal allocation 

In this section, we develop efficient algorithms to search for locally D-optimal alloca- 
tions with given Wj's. The same algorithms can be used for finding EW D-optimal 
designs. 

3.3.1 Lift-one algorithm for mciximizing /(p) = |X'VrX| 

Here we propose the lift-one algorithm for searching locally D-optimal p = [pi, . . . , ^2*)' 
with given Wj's. The basic idea is that, for randomly chosen i G {1, . . . , 2^^}, we up- 
date Pi to p* and all the other p/s to p* = pj ■ jzf- This technique is motivated by 
the coordinate descent algorithm (Zangwill, 1969). It is also in spirit similar to the 
idea of one-point correction in the literature (Wynn, 1970; Fedorov, 1972; Muller, 
2007), where design points are added/adjusted one by one. The major advantage of 
the lift-one algorithm is that in order to determine an optimal p*, we need to calculate 
only once due to Lemma [3.11 

Lift-one algorithm: 

1° Start with arbitrary po = (pi, . . . ,^2'=)' satisfying < pj < 1, i = 1, . . . , 2^^ and 
compute / (po)- 

2° Set up a random order of i going through {1,2, . . . ,2'=}. 

3° For each z, determine fi{z) as in ( 1A.8I) . In this step, either /i(0) or fi (|) needs 
to be calculated according to equation 

4° Define pl*^ = (^^Pi, • • • , T^P^+i' • • • ' T^^Pa'^) , where maxi- 

mizes fi{z) with < 2 < 1 (see Lemma [7.2p . Note that /(pi*'') = fi{z^). 

5° Replace po with pl*\ / (po) with /(pl*^). 

6° Repeat 2° ~ 5° until convergence, that is, /(po) = f{v*^) for each i. 

While in all examples that we studied, the lift-one algorithm converges very fast, we 
do not have a proof of convergence. There is a modified lift-one algorithm, which is 
only slightly slower, that can be shown to converge. This algorithm can be described 
as follows. For the lOmth iteration and a fixed order of i = 1, . . . , 2^^ we repeat steps 
3° ~ 5°, m = 1, 2, . . ., if pl*^ is a better allocation found by the lift-one algorithm than 
the allocation po, instead of updating po to pi*-* immediately, we obtain pl*^ for each 

i, and replace po with the ffist best one among |pl*\ i = 1, . . . , 2'^'|. For iterations 
other than the lOmth, we follow the original lift-one algorithm update. 
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Figure 1: Number of support points in the optimal design (based on 1000 simulations) 

Theorem 3.3 When the lift-one algorithm or the modified lift-one algorithm con- 
verges, the converged allocation p maximizes \X'WX\ on the set of feasible alloca- 
tions. Furthermore, the modified lift-one algorithm is guaranteed to converge. 

Our simulation studies indicate that as k grows, the optimal designs produced by 
our lift-one algorithm is supported only on a fraction of all the 2'' design points. To 
illustrate this, we randomly generated the regression coefficients iid from 3,3) 
and applied our algorithm to find the optimal designs under logit link. Figure [1] gives 
histograms of number of support points in the optimal designs. For example, with 
k = 2, 76% of the designs are supported on three points only and 24% of them are 
supported on all four points. As k becomes larger, the number of support points 
moves towards a smaller fraction. 

As demonstrated in Tabled! our lift-one algorithm is much faster than commonly used 
optimization techniques including Nelder-Mead, quasi-Newton, conjugate- gradient, 
and simulated annealing (for a comprehensive reference, see Nocedal and Wright 
(1999)). Our algorithm finds the optimal solution with more accuracy as well. For 
example, for a 2^ design problem, the relative efficiency of the design found by the 
Nelder-Mead method with respect to the design found by our algorithm is only 95% 
on average. There are 10% of the cases with relative efficiencies less than 75%. When 
the number of factors becomes large, our investigation indicates that some algorithms 
do not converge within a reasonable amount of time or are not applicable due to 
unavailability of the gradient of the objective function. These are denoted by "NA" 
in Table [H 
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Table 1: Performance of the lift-one algorithm (CPU time in seconds for 100 simulated 
w) 
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3.3.2 Algorithm for maximizing |X'iyX| with integer solutions 

To maximize |X'iyX|, an alternative algorithm, called exchange algorithm (see Sup- 
plementary Materials for detailed description), is to adjust pi and pj simultaneously 
for randomly chosen index pair The original idea of exchange was suggested 

by Fedorov (1972). Due to Lemma the optimal adjusted (p*,p^) can be ob- 
tained easily by maximizing a quadratic function. Unlike the lift-one algorithm, the 
exchange algorithm can be applied to search for integer-valued optimal allocation 
n = (rii, . . . , n2fc)', where = n. 

Exchange algorithm for integer-valued allocations: 

1° Start with initial design n = (rii, . . . , n2k)' such that /(n) > 0. 
2° Set up a random order of going through all pairs 

{(1, 2), (1,3),..., (1,2^), (2, 3),..., (2^=- 1,2^')} 

3° For each (i, j), let m = nj + nj. If m = 0, let n*^ = n. Otherwise, calculate 
fij{z) as given in equation (lA.lip . Then let 

n*j = (ni, . . . , ni_i, z^, n^+i, . . . , Uj^i, m - 2;*, n^+i, . . . , risfc) 

where the integer z^, maximizes fij{z) with < 2 < m according to Lemma [73] 
in the Appendix. Note that /(n*^) = fij{z*) > /(n) > 0. 

4° Repeat 2° ~ 3° until convergence (no more increase in terms of /(n) by any 
pairwise adjustment). 

As expected, the integer- valued optimal allocation (ni, . . . , n2fe)' is consistent with the 
proportion-valued allocation (pi, . . . ,^2'=)' for large n. For small n, the result may be 
used for fractional design problem in Section HI It should be noted that the exchange 
algorithm with slight modification is guaranteed to converge to the optimal allocation 
when searching for proportions but not for integer-valued solutions, especially when 
n is small compared to 2^. 



11 



In terms of finding optimal proportions, the exchange algorithm produces essentially 
the same results as the lift-one algorithm, although the former is relatively slower. 
For examples, based on 1000 simulations, the ratio of computational time of the 
exchange algorithm over the lift-one algorithm is 6.2, 10.2, 16.8, 28.8, 39.5 and 51.3 
for /c = 2, . . . , 7 respectively. Note that it requires 2.02, 5.38, 19.2, 84.3, 352, and 
1245 seconds respectively to finish the 1000 simulations using the lift-one algorithm 
on a regular PC with 2.26GHz CPU and 2.0G memory. As the total number of 
factors k becomes larger, the computation is more intensive. Detailed study of the 
computational properties of the proposed algorithms is a topic of future research. 

4 Fractional Factorial Designs 

Unless the number of factors k is very small, the total number of experimental con- 
ditions 2^ is large, and it could be impossible to take observations at all 2^ level 
combinations. The reason is twofold: the total number of observations n would be- 
come large, as well as the number of level changes which are expensive in many 
applications. So when k is large, the experimenter may have to consider fractional 
designs. For linear models, the accepted practice is to use regular fractions due to 
the many desirable properties like minimum aberration and optimality. We will show 
that in our setup the regular fractions are not always optimal. First, we will examine 
the situations when they are optimal. 

We use 2^ designs for illustration. More discussion on general cases can be found in 
Section 15.11 The design matrix for 2^ main-effects model consists of the first four 
columns of X given in equation ([1]) and Wj represents the information corresponding 
to the jth row of that matrix. The maximum number of experimental conditions 
is fixed at a number less than 8, and the problem is to identify the experimental 
conditions (rows of X) and corresponding p^'s that optimize the design optimality 
criterion. Half fractions use 4 experimental conditions (hence the design is uniform). 
The fractions defined by rows {1, 4, 6, 7} and {2, 3, 5, 8} are regular fractions, given by 
the defining relations 1 = ABC and — 1 = ABC respectively. If the initial values of 
all the regression coefficients corresponding to the design factors are zeros, it leads to 
the case where all the Wj's are equal and effectively reduces to the linear model, where 
the regular fractions are D-optimal. The following Theorem identifies the necessary 
and sufficient conditions for regular fractions to be D-optimal in terms of wis and 
are valid for all link functions. 

Theorem 4.1 For the 2^ main-effects model, suppose (3i = 0. This implies Wi = w^, 
W2 = wq, W3 = w-i, and W4 = wq- The regular fractions {1, 4, 6, 7} and {2, 3, 5, 8} are 
D-optimal half-fractions if and only if 

4 min{u'i, W2, W3, W4} > max{w;i, W2, w^, W4}. 

Further suppose (32 = 0. Then Wi = W3 = W5 = and W2 = W4 = Wq = Ws- The 
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two regular half-fractions mentioned above are D- optimal half-fractions if and only if 
4min{wi,W2} > max{wi,W2}- 

Example 4.1 Under logit link, consider the 2^ main-effects model with /3i = /32 = 0. 
The regular half- fractions {1,4,6,7} and {2,3,5,8} are D-optimal half-fractions if 
and only if one of the following happens: 



When the regular half-fractions are not optimal, the goal is to find {ii,i2,'>'3,H} that 
maximizes \X[ii,i2,i3,i4\\'^Wi-^Wi2Wi.^Wi^ based on Lemma [XT! Recall that in this case 
there are only two distinct Wi's. If /Jq/^s > 0, Wi's corresponding to {2,4,6,8} are 
larger than others, so this fraction given by C = —1 will maximize Wi^Wi^Wi.^Wi^ 
but this leads to a singular design matrix. It is not surprising that the D-optimal 
half-fractions are "close" to the design {2,4,6,8}, and are in fact given by the 16 
designs each consisting of three elements from {2, 4, 6, 8} and one from {1, 3, 5, 7}. For 
/So/Ss < 0, D-optimal half-fractions are similarly obtained from the fraction C = +1. 

Figure [2] below partitions the parameter space for 2^ main-effects logit model. The left 
panel corresponds to the case Pi = = 0. Here the parameters in the middle region 
would make the regular fractions D-optimal whereas the top-right and bottom-left 
regions correspond to the case /So/Ss > 0. Similarly the other two regions correspond 
to the case PqP^ < 0. The right panel of Figure [2] is for the case Pi = and shows the 
contour plots for the largest |/3o|'s that would make the regular fractions D-optimal. 
(For details, see the Supplementary Materials of this paper.) Along with Figure [2l 
conditions ([3]) and fIS.lOl) in Supplementary Materials indicate that if Pi,P2 and P3 
are small then the regular fractions are preferred (see also Table [2] below). However, 
when at least one \Pi\ is large, the information is not uniformly distributed over the 
design points and the regular fractions may not be optimal. 

In general, when all the /9j's are nonzero, the regular fractions given by the rows 
{1,4,6,7} or {2,3,5,8} are not necessarily the optimal ones. To illustrate this, we 
simulate the regression coefficients Po, Pi, P2, P3 independently from different distri- 
butions and calculate the corresponding w's under logit, probit and complementary 
log-log links for 10,000 times each. For each w, we find the best design supported 
on 4 distinct rows of the design matrix. By Lemma 13. H any such design has to be 
uniform in order to be D-optimal. Table [2] reports the percentages of times each of 
those designs turn out to be the optimal ones for the logit model. The results are 
somewhat similar for the other links. It shows that the regular fractions are optimal 
when the /3j's are close to zero. In Table [21 we only reported the fractions which 
turned out to be D-optimal for more than 15% of the times with the exception of 
regular fractions. For the 2^ case, the results are similar, that is, when the /3j's are 



(i) m<\og2 



(3) 



{ii) l/^sl > log2 and 
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Figure 2: Partitioning of the parameter space 
Table 2: Distribution of D-optimal half-fractions under 2^ main-effects model 



Rows 


Percentages 








u{- 


10,10) 




iV(0,5) 


Simulation 




C/(-.3,.3) 


C/(-3,3) 


(7(-3,0) 


[/(0,1) 


N{1,1) 


Setup 




C/(-.3,.3) 


C/(0,3) 


C/(0,3) 


[/(0,3) 


7V(2,1) 






C/(-.3,.3) 


C/(l,5) 


[/(-2,2) 


C/(0,5) 


A^(3,l) 


1467 




42.65 (99.5) 


0.07 (0.22) 


0.86 (2.18) 


0.95 (2.61) 


0.04 (2.89) 


2358 




42.02 (99.5) 


0.04 (0.23) 


0.68 (2.12) 


1.04 (2.56) 


0.08 (2.86) 


1235 






16.78 




35.62 


21.50 


1347 








19.98 






1567 






17.45 


19.21 






2348 






17.54 


19.11 






2568 








20.01 






4678 






16.12 




35.41 


21.65 



nonzeros, the performance of the regular fractions given by 1 
efficient in general. 



±ABCD are not very 



Relative efficiency of a design is defined as 1 — R{p, w) where -R(p, w) is the relative 
loss of efficiency given in equation of Section 15.11 In Table [H we have provided, 
within parenthesis, the percentages of times the regular fractions were at least 95% 
efficient compared to the best half-fractions. It is clear that when the regular fractions 
are not D-optimal, they are usually not highly efficient either. On the other hand, for 
each of the five situations described in Table |2l we also calculated the corresponding 
EW D-optimal half-fractions. For all of the five cases, including the highly asymmetric 
fifth scenario, the condition for Theorem 14.11 is satisfied and the regular fractions are 
the EW D-optimal half-fractions. 



Remark 2 Consider the problem of obtaining the locally D-optimal fractional fac- 
torial designs when the number of experimental settings (m, say) is fixed. If the total 
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number of factors under consideration is not too large, one can always calculate the 
D-efficiencies of all fractions and choose the best one. However, we need a better 
strategy for even moderately large number of factors. One such strategy would be to 
choose the m largest Wj's and the corresponding rows, since the Wj's are the informa- 
tion at that point. Another one could be to use our algorithms discussed in Section 
for finding an optimal allocation for the full factorial designs first. Then choose the 
m largest p^'s and scale them appropriately. One has to be careful in order to avoid 
designs which would not allow the estimation of the model parameters. In this case, 
the exchange algorithm described in Section PJ. 3. 21 may be used to choose the fraction 
with given m experimental units. Our simulations (not presented here) shows that 
both of these methods perform satisfactorily with the second method giving designs 
which are generally more than 95% efficient for four factors with main-effects model. 
This method will be used for computations in the next section. 

5 Robustness 

In this section, we will check the robustness of locally D-optimal designs over the 
initial parameter values, for both full factorial and fractional factorial setups. 

5.1 Most robust saturated designs 

Saturated designs have been studied extensively. For continuous, quantitative factors, 
these designs are D-optimal for many linear and non-linear models. In our setup 
of qualitative factors, saturated designs are attractive since they use the minimal 
number, (d + l), of experimental conditions. In many applications, level changes are 
expensive, hence fewer experimental conditions are desirable. In this section, we will 
examine the robustness of saturated designs. Since here the design matrix has {d+ 1) 
columns, it is straightforward to find the locally D-optimal saturated designs as given 
in Theorem 15. ![ which immediately follows from Lemma 13.11 

Theorem 5.1 Let I = {ii, . . . , id+i} C {1, . . . , 2^} be an index set. A design p/ = 
(pi, . . . , p2k)' satisfying pi = 0,'^i ^ I is a D-optimal saturated design if and only if 

Pii = ■■■ = Pia+i = I maximizes \X[ii, . . . ,id+i]\^Wi^ ■ --Wi^^^. 

For investigating the robustness of these designs, let us denote the D-criterion value 
as V'IP; w) = |XWX| for given w = {wi, . . . , W2ky and p = (pi, . . . ,P2k)' ■ Suppose 
Pit, is a D-optimal allocation with respect to w. Then the relative loss of efficiency of 
p with respect to w can be defined as 
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Let us define the maximum relative loss of efficiency of a given design p with respect 
to a specified region W of w by 

^max(p) = maxi?(p,w). (5) 

It can be shown that W takes the form [a, b]"^^ for 2'' main-effects model if the range 
of each of the regression coefficients is an interval symmetric about 0. For example, 
for a 2^ main-effects model, if all the regression coefficients range between (—3,3), 
then W = [3.06 x 10"^ 0.25]^^ for logit link, [8.33 x 10"^^ 0.637]i6 for probit link, or 
(0, 0.648]^^ for (complementary) log-log link. This justifies the choice of the range of 
Wj's in Theorem 15.21 below. A design which minimizes the maximum relative loss of 
efficiency will be called most robust. 

Theorem 5.2 Suppose k > 3 and d{d + 1) < 2^^"^^ — 4. Suppose Wi G [a, b], 
i = 1, . . . ,2'' , < a < b. Let I = {ii, . . . ,id+i} be an index set which maximizes 
\X[ii,i2,-- . Then the design p/ = (pi, . . . ,^20' satisfying pi^ = ... = ^-^^^ = 

is a most robust saturated design with maximum relative loss of efficiency 1 — f ■ 

Based on Theorem 15. 21 the maximum relative loss of efficiency depends on the possible 
range of Wj's. The result is practically meaningful only if the interval [a, b] is bounded 
away from 0. Figure [3] provides some idea about the possible bounds of Wi's for 
commonly used link functions. For example, for 2^ designs with main-effects model, 
suppose we know that 0.105 < Wi < 0.25 under logit link (see Remark 4.1.1 of Yang, 
Mandal and Majumdar (2012)), then the maximum relative loss of efficiency of the 
regular half-fractional design satisfying pi = p4 = = P7 = 1/4 is 1 — 0.105/0.25 = 
58%. The more certain we are about the range of iWj's, the more useful the result will 
be. 

Note that for main-effects models, the condition d{d+l) < 2'^'^^ — 4 in Theorem 15. 2l is 
guaranteed by A; > 3. A most robust saturated design can be obtained by searching 
an index set {ii, . . . , id+i} which maximizes \X[ii, i2, ■ ■ ■ , Note that such an 

index set is usually not unique. Based on Lemma [731 if the index set {zi, . . . , id+i} 
maximizes \X[ii, . . . there always exists another index set {i[, . . . such 

that \X[ii, . . . , = \X[i[, . . . , It should also be noted that a most robust 

saturated design may involve a set of experimental conditions {ii, . . . ,id+i} which 
does not maximize |X[zi, . . . , z^+i]!^. For example, consider a 2^~^ design with main- 
effects model. Suppose Wi G [a,b], i = 1,...,8. If 4a > b, then the most robust 
saturated designs are the regular 2^~^ fractional ones. Otherwise, if 4a < b, then 
any uniform design restricted to {ii, 12, is, ^4} satisfying |X[zi, 12, i^, i4\ \ 7^ is a most 
robust saturated one. 

5.2 Robustness of uniform designs 

A design is called "uniform" if the allocation of experimental units is the same for 
all points in the support of the design. Yang, Mandal, and Majumdar (2011) showed 



16 



Table 3: Relative loss of efficiency of 2^ uniform design 



Simulation 
Setup 


Percentages 

Pq^U{-3,3) 1) t/(-3,0) iV(0,5) 
/?i~C/(-l,l) U{0,1) C/(l,3) iV(0,l) 
/32~C/(-l,l) U{0,1) C/(l,3) iV(2,l) 
/33~C/(-l,l) U{0,1) t/(-3,-l) iV(-.5,2) 
/?4~C/(-l,l) [/(0,1) t/(-3,-l) A^(-.5,2) 


Quantiles 

^99 
^95 
^90 


(I) (II) (III) (I) (II) (III) (I) (II) (III) (I) (II) (III) 
.348 .353 .348 .146 .111 .112 .503 .273 .299 .650 .864 .726 
.299 .304 .299 .128 .094 .093 .495 .251 .256 .617 .788 .670 
.271 .274 .271 .117 .084 .085 .488 .239 .233 .589 .739 .629 


Note: (I) = i?iooQ(Pu), (H) = ^^min^^i?iooa(p.s), (HI) = i?iooa(Pe)- 



that for a 2^ main-effects model, the uniform design is the most robust design in terms 
of maximum relative loss of efficiency. In this section, we use simulation studies to 
examine the robustness of uniform designs and EW D-optimal designs for higher order 
cases. 

For illustration, we use a 2'^ main-effects model. We simulate (3o, . . . , fS^ from different 
distributions 1000 times each and calculate the corresponding w's, denoted by wi, 
. . ., wiooo- For each w^, we use the algorithm described in Section l3.3.1l to obtain a 
D-optimal allocation p^. For any allocation p, let -Riooo(p) denote the ath quantile 
of the set of relative loss of efficiencies {-R(p, w^), s = 1, . . . , 1000}. Thus -Rioo(p) = 
-Rmax(p) which is the -Rmax defined in ([5]) with W = {wi, . . . , wiooo}- Since W here 
is simulated, the quantities -R99(p) or i?95(p) are more reliable in measuring the 
robustness of p. 

Table [3] compares the -Riooa of the uniform design p„ = (1/16,. ..,1/16)' with the 
minimum of -Riooa(Ps) for s = 1, . . . , 1000, as well as the -Riooa of the EW design pe. 
In this table, if the values of column (I) is smaller than those of column (II), then we 
can conclude that the uniform design is better than all the D-optimal designs in terms 
of the quantiles of relative loss of efficiency, which happens in many situations. It is a 
strong evidence supporting that the uniform design p„ is one of the most robust ones 
if the /3j's occupy symmetric range around zero. This is consistent with the conclusion 
of Cox (1988). 

However, there are situations where the uniform design does not perform well, as 
illustrated by the two middle blocks of Table [31 If the signs of the regression coeffi- 
cients are known, it is advisable not to use the uniform design. For many practical 
applications, the experimenter will have some idea about the direction of effects of 
factors, which in statistical terms determines the signs of the regression coefficients. 
For these situations, it turns out that the performance of the EW D-optimal designs 
is comparable to that of the most robust designs, even when the uniform design does 
not perform well (see columns (III) in Table [3], where Pe is the EW design). Hence we 
recommend the use of EW D-optimal designs when the experimenter has some idea 
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about the signs of /3i's. Uniform designs are recommended in the absence of prior 
knowledge of the regression parameters. 

Now consider the uniform designs restricted to regular fractions. Again we use 2^ 
main-effects model as illustration and consider the uniform designs restricted to the 
regular half-fractions identified by 1 = ±ABCD. We performed simulations as above 
and our conclusions are similar, that is, uniform designs on regular fractions are among 
the most robust ones if the regression parameters are supported symmetrically around 
zero but they may not perform well if the /3j's are either positive or negative. 

6 Examples and Discussion 

In this section we make some remarks, with illustrations, about our results and their 
use. 

Remark 3 The logit link is the most commonly used link in practice. The situation 
under this link function is close to that in the linear model case because Wj's do not 
vary much {0 < Wi < 0.25), hence the uniform design is expected to perform not 
too poorly, since they are optimal for the linear model case. But this is not the case 
for other link functions. In general, the performance of the logit and probit links 
are similar, while that of the complementary log-log link is somewhat different from 
others. For example, if we repeat Example 13.11 but under complementary log-log 
link, the relative efficiency of the uniform design is only 89.6% with respect to Bayes 
optimal design. Figure [3] provides a graphical display of the function u for commonly 
used link functions. As seen from the figure, complementary log-log link function is 
not symmetric about 0. This explains the poor performance of the uniform design 
under this link. Nevertheless, the EW D-optimal designs are still highly efficient across 
different link functions. Repeating Example 13.11 with commonly used link functions, 
the relative efficiencies of EW designs with respect to the corresponding Bayesian 
optimal designs are 99.99% (logit link), 99.94% (probit link), 99.77% (log-log hnk), 
and 100.00% (complementary log-log link), respectively. 



Example 6.1 Consider the Windshield molding experiment discussed in Section [T] 
By analyzing the data presented in Hamada and Nelder (1997), we get an estimate of 
the unknown parameter as /3 = (1.77, —1.57, 0.13, —0.80, —0.14)' under logit link. If 
one wants to conduct a follow-up experiment on half-fractions, then it is sensible to 
use the knowledge obtained by analyzing the data. In this case, the efficiency of the 
original design is 78% of the locally D-optimal design if f3 were the true value of 
the unknown parameter f3. Instead of taking f3 exactly if a point close to it is taken 
as the initial value, say, f3 = (2, —1.5, 0.1, —1, —0.1)', the D-optimal design is given 
in Table |H A reasonable option is to consider a range for the possible values of the 
regression parameters, namely, (1, 3) for (3q, (—3, —1) for (—0.5, 0.5) for (32, f^^, and 
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Figure 3: Wi = z^(?7i) = i^(x^/3) for commonly used link functions 

(—1,0) for Ps. For this choice of range for the parameter values with independence 
and uniform distributions, the EW D-optimal half-fractional design pe is given in 
Table H] too. Note that if we regard (3 as the true value of the parameter, the relative 
efficiency of increases to 99%, whereas that of Pe increases to 98%. We have also 
calculated the linear predictor rj and success probability vr for all possible experimental 
settings. It seems that a good fraction would not favor high success probabilities very 
much. This is one of the main differences between the design reported by Hamada 
and Nelder (denoted by Phn) and our designs (denoted by p^ and Pe). Note that 
these two designs have six rows in common. 

Example 6.2 Consider the "Project Quit" experiment discussed in Section [H The 
authors noted that a full factorial experiment with 6 factors requiring 64 different 
settings was not feasible. So they conducted a 2^^^ experiment defined by E = ABC 
and F = ACD. The authors provided us with a dataset which was very similar 
to that one presented in their paper. By analyzing the data, we get an estimate of 
the unknown parameter as ^ = (-1.10, 0.01, 0.06, 0.09, 0.02, 0.10, -0.04)' under logit 
link. Unlike the previous example, here the efficiency of the original design used by 
Nair et al. (2008) is more than 99% of the locally D-optimal design if /3 were the 
true value of the unknown parameter /3. Hence, if someone wants to conduct a future 
experiment under similar set up, it will be advisable to use the same design. In this 
case, only factors C and E come out to be significant and that information can be 
used to reduce the number of factors, if needed. 

Example 6.3 Consider the Mice experiment discussed in Section [1] The design used 
by Seo, Goldschmidt- Clermont and West (2007) is very efficient if all the regression 
coefficients of the factors are close to zeros, which leads to equal Wj's and corresponds 
to a linear model. However, their design is not very efficient if the /3j's are not close 
to zero. To illustrate this, we simulate 1000 /3's from different distributions, and for 
each of them calculate the optimal allocation. Then we compare the efficiency of each 
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Table 4: Optimal half-fraction design for Windshield Molding Experiment 



Row 


A 


B 


C 


D 


V 


TT 


Phn 


Pa 


Pe 


5 


+ 1 


— 1 


+1 


+1 


-0.87 


0.295 






0.044 


0.184 


1 


+ 1 


+1 


+ 1 


+ 1 


-0.61 


0.352 





125 


0.178 


0.011 


6 


+ 1 


— 1 


+ 1 


— 1 


-0.59 


0.357 





125 


0.178 


0.011 


2 


+1 


+ 1 


+ 1 


— 1 


-0.33 


0.418 






0.059 


0.184 


7 


+ 1 


— 1 


— 1 


+ 1 


0.73 


0.675 





125 


0.163 




3 


+ 1 


+ 1 


— 1 


+ 1 


0.99 


0.729 








0.195 


8 


+1 


— 1 


— 1 


— 1 


1.01 


0.733 








0.195 


4 


+ 1 


+ 1 


— 1 


— 1 


1.27 


0.781 





125 


0.147 




13 


— 1 


— 1 


+ 1 


+ 1 


2.27 


0.906 


U 




0.158 


0.111 


9 


-1 


+ 1 


+ 1 


+ 1 


2.53 


0.926 










14 


-1 


-1 


+1 


-1 


2.55 


0.928 










10 


-1 


+ 1 


+1 


-1 


2.81 


0.943 





125 


0.074 


0.110 


15 


-1 


-1 


-1 


+ 1 


3.87 


0.980 










11 


-1 


+ 1 


-1 


+ 1 


4.13 


0.984 





125 






16 


-1 


-1 


-1 


-1 


4.15 


0.984 





125 






12 


-1 


+ 1 


-1 


-1 


4.41 


0.988 











Table 5: Performance of Seo et al.'s design 





Relative EfBciencies 


Simulation 
Setup 


^0- C/(-10,10) 
/3i ~ [/(-.3,.3) L/(-3,3) [/(-3,3) t/(-3,0) U{0,1) 
C/(-.3,.3) C/(-3,3) [/(0,3) [/(0,3) [/(0,2) 
/33- C/(-.3,.3) C/(-3,3) [/(1, 3) t/(-2,2) [/(0,3) 
/34 ~ [/(-.3,.3) L/(-3,3) [/(2,5) U{-l,l) C/(0,5) 


Mean 
SD 


0.940 0.503 0.446 0.566 0.547 
0.029 0.121 0.090 0.121 0.118 



of these optimal designs with that of the design used by Seo et al. (2007) and report 
the mean and standard deviations of those efficiencies in Table [51 

As noted in the introduction and justified by the results in Section 15.2^ if the ex- 
perimenter has no prior knowledge of the parameters then it is recommended to use 
the uniform design. Otherwise, EW optimal designs should be used because they are 
very robust in terms of minimizing the maximum relative loss. In the EW optimality, 
we replace the WiS by their expectations. Note that taking the average of WiS is 
not same as taking the average of /Sj's. Let us illustrate it with 2^ design with main- 
effects model. Table[6]below uses the notations from Table|3l Suppose /3o ~ 3, 0), 
Pi, P2 ~ ^^(1,3), /33,/34 ~ [/(— 3, — 1), and the /3j's are independent. It is clear that 
uniform design performs much worse compared to the most robust design, while the 
performance of the EW D-optimal design is comparable with the best design. The 
last column corresponds to the locally D-optimal design where the initial value of the 
parameter is taken to be the midpoints of the ranges of /3j's mentioned above. Clearly 
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Table 6: Performance of different designs for 2^ main-effects model 





Uniform 


Most robust 


EW 




R99 


0.503 


0.273 


0.299 


0.331 


R95 


0.495 


0.251 


0.256 


0.284 


R90 


0.488 


0.239 


0.233 


0.251 



this is worse than the EW optimal design. 
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Appendix 

We need two lemmas before the proof of Theorem 13. 1[ 



Lemma 7.1 Suppose p = (pi, . . . ,^2'^)' satisfies f (p) > 0. Given i = 1, . . . ,2^ , 

fi{z) = az{l-z)^ + b{l-zf^\ (A.8) 

for some constants a and b. If pi > 0, b = fi{0), a = ^^^'^~^^^~^^d — / otherwise, 
b = f (p), a = fi (i) ■ 2*^+1 - b. Note that a > 0, b > 0, and a + b > 0. □ 



Lemma 7.2 Let h{z) = az{l - z^ + 6(1 - with < z < 1 and a > 0,b > 

0,a + b>0. Ifa>bid+l),thenmax,hiz) = {^Y{^Y-'' at z = ^^^^ < 1. 
Otherwise, max^ h{z) = b at z = 0. □ 



Proof of Theorem I3.lt Note that /(p) > implies and < Pi < 1 for each 
i = 1, ... ,2^. Since '^^Pi = 1, without any loss of generality, we assume P2k > 0. 

Define p^ = (pi, . . . ,P2'=-i)', and f^^'^Pr) = f (^Pi, ■ ■ • ,P2'--i, 1 - Ylf=1^Pi)- 

For z = 1, . . . , 2'' - 1, let S'-p = {-pi, . . . , -pi-i, 1 - Pi, -Pi+i, . . . , -P2'^-i)'. Then 
fi{z) = f'''^'\pr + uSl''^) with u = jz^- Since the determinant \ . . . , S^^^_-^)\ = 
P2k 7^ 0, 5^^\ . . . , ^2^-1 are linearly independent and thus may serve as a new basis of 

= {(pi,...,P2fc„i)' I < 1, and >0,^ = 1,..., 2^-1}. (A.9) 

i=l 
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Since log /^^'■'(pr) is concave, maximizes if and only if along each direction d 



(r) 
i ' 



(r)^ 



du 



if Pi > 0; < otherwise. 



u=0 



That is, fi{z) attains its maximum at z = pi, for each i = 1, . . . ,2'^ — 1 (and thus for 
i = 2^). Based on Lemma [7. II and Lemma [7.21 it implies one of the two cases: 

(i) = and /, (i) ■ 2^+1 - /(p) < /(p)(rf+ 1); 

(ii) Pi > 0, a > h{d + 1), and a — b{d + 1) = pi{a — b){d + 1), where b = /i(0), and 

„ _ f(P)-b{i-Pir+' 

The conclusion needed can be obtained by simplifying those two cases above. □ 

Proof of Theorem 13. 2t Let p/ be the saturated design satisfying Pi^ = Pi2 = ■ ■ ■ = 
Pid+i ~ d+l- Note that if \X[ii,i2, . . . = 0, p/ can not be D-optimal. Suppose 

\X\i 1, Z2, . . . , id+i]\ 7^ 0, p/ is D-optimal if and only if p/ satisfies the conditions of 
TheoremO By LemmaEHl f{pi) = + is, • • . ,id+i]\'^WhWi2 ■ ■ ■ ^i^+i- 

For i e I, Pi = /j(0) = 0. By case (ii) of Theorem 13.11 Pi = maximizes fi{x). 
For i ^ I, Pi = 0, 

fJl-] = [2(d+ir('^+^)|x[^i,...,2,+i]iv---^..+i 



l^[Wui\{j}] 



2 



W 



3 



i6i 

Then Pi = maximizes fi{x) if and only if fi (|) < f{p)-^^, which is equivalent to 



E 



\x{{i}ui\i]}]r ^ \x{h,i2,...,u+i] 



□ 



We need the lemma below for Theorem 15. 2t 

Lemma 7.3 Suppose k > 3 and d{d + 1) < 2^"*"^ — 4. For any index set I = 
{ii, . . . , id+i} C {1, . . . , 2^}, there exists another index set V = {i[, . . . , such 
that 

\X[i,, p = \X[t[, t',^,]\^ and I n r = 0. (A.IO) 

Proof of Theorem 15. 2t Fixing any row index set / = {ii, . . . ,id+i} of X such 
that |X[2i,'i2, • • • > 0, among all the {d + l)-row fractional designs satisfying 

Pi = 0, Vz ^ I, |XWX| attains its maximum {-^) Wi^ ■ ■ ■ Wi^^-^-\X\ii, i2, ■ ■ ■ , 
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at p/ satisfying Pi^ = ■ ■ ■ = Vid+i — dTT- Lemma 17.31 there exists an index set 
r = . . . such that |X[i;, . . . = . . . and I n I' = 0. Let 

w// = {wi, . . . ,102^)' satisfy Wi = b,Wi E V and Wi = a,Wi E I. Then the relative loss 
of efficiency of p/ with respect to w// is 

\4j{pr,wp)J b 

Thus the maximum relative loss of efficiency of pi is at least 1 — f • If I maximizes 
. . . , then p/'s maximum relative loss of efficiency attains the minimum 

1-f. □ 

We need two lemmas for the exchange algorithm for integer-valued allocations. 

Lemma 7.4 Let g{z) = Az{m — z) + Bz + C{m — z) + D for real numbers A > 
0, B > 0,C > 0, D > 0, and integers m > 0,0 < z < m. Let A be the integer closest 

mA+B-C 
'-'^ 2A 

(i) IfO < A< m, thenmaxo<^<rng{z) = mC+D+{mA+B-C)A-AA'^ at z = A. 
(a) If A < 0, then ma.xo<z<m = fnC + D at z = 0. 
(Hi) If A > m, then maxo<z<m = fnB + D at z = m. 

Lemma 7.5 Let n = (rii, . . . , 77-2*)', Wn = diagjniiyi, . . . , n2kW2k}, f{n) = \X'WnX\. 
Fixing I <i < j <2^ , let 

fiji^) = f {ni, . . . ,ni_i,z,ni+i, . . . ,nj^i,m - z,nj+i, . . . ,n2k) 

= Az{m - z) + Bz + C{m - z) + D, (A.ll) 

where m = Ui + Uj. Then (i) D > =^ B > and C > 0; (ii) B > or C > 
^ A > 0; (ill) /(n) > ^ A > 0; (iv) D = /(ni, . . . , n^-i, 0, rij+i, . . . , 
nj-i,0,nj+i, . . . , n2k). (v) Suppose m> 0, then A = [2fij (f ) - fij{0) - fij{m)), 
B = ^{f,,{m)-D),C = ^{f,,{0)-D). 
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Supplementary Materials 

Additional Results for Example 4.1: Consider a 2^ main-effects model with logit 
link. Suppose /3i = 0. As a corollary of Theorem 14.11 the regular fractions {1, 4, 6, 7}, 
{2, 3, 5, 8} are D-optimal half-fractions if and only 

4 m + m + m) > v f i/3oi + \h + m - 2 max \^\\ . 

Note that vij]) = logit link, which is symmetric about 0. To simplify 

the notations, let /32v3 = niax{|/32|, I/Ssi} and /32a3 = niin{|/32|, l/Ss]}. The regular 
fractions {1, 4, 6, 7}, {2, 3, 5, 8} are D-optimal half-fractions if and only if one of three 
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conditions below is satisfied: 



(i) I/32I + I/33I <log2; 

(ii) I/32I + m > log 2, /32V3 < log (1 + + [1 + e-/32A3 + g-2/32A3] V2j 



(S.IO) 




(iii) /32V3 > log (1 + e-^'^' + [1 + e-^'^' + 6"^^^^^] '^') , \(32v3\ < log 



2el/32A3| _ 1 



) 



el/32A3| _ 2 




The above result is displayed in the right panel of Figure [21 In the x- and y-axis, we 
have plotted P2 and (3^ respectively. The rhomboidal region at the center (marked 
as 00) represents the region where the regular fractions will always be D-optimal, 
irrespective of the values of (5q. The contours outside this region are for the upper 
bound of \Po\- Regular fractions will be D-optimal if the values of |/3o| will be smaller 
than the upper bound with /32 and /^s falling inside the region outlined by the contour. 



More proofs 

Proof of Theorem l4.lt Given f3i = 0, we have wi = = z^(/3o+/32+/33), W2 = wq = 
'^(/3o+/32-/33), W3 = W7 = iy{f3o-l32+f33), W4 = ws = z^(/3o-/32-/33)- The goal is to find 
a half- fraction I = {zi, Z2, ^3, which maximizes := \X[ii,i2,i3,ii]\'^Wij^Wi^Wi.jWi^. 
For regular half-fractions I = {1,4,6,7} or {2,3,5,8}, = 256W1W2W3W4. Note 
that ^2, ^3, hW"^ = for 12 half-fractions identified by 1 = ±^4, 1 = ±B, 1 = ±C, 

1 = ±AB, 1 = ±AC, or 1 = ±BC; and \X[ii, 12, is, ^4]^ = 64 for all other 56 cases. 

Without any loss of generality, suppose wi > 'u;2 > > W4. Note that the half- 
fraction {1,5,2,6} identified by 1 = i? leads to = 0. Then the competitive 
half-fractions consist of both 1 and 5, one element from the second block {2, 6}, and 
one element from the third block {3, 7}. The corresponding = 6AWIW2W3. In this 
case, the regular fractions are optimal ones if and only if Aw^ > Wi. □ 

Proof of Lemma Note that A; > 3 and d{d + 1) < 2^+^ - 4 imply d+l< 2^^^ 
and f^i^^tll < 2^ — 1. Fix an arbitrary index set I = {ii, . . . , id+i} C {1, . . . , 2*^}. It 
can be verified that there exists a nonempty subset J C {1, 2, ... , k}, for example 
J = {1, 2}, such that (i) the rows ii, ■ . . , id+i of [—Ai, —A2, A3, . . . , Ak] are the same 
as the rows i'l, ■ ■ ■ , of [Ai, A2, A3, . . . , A^], where Ai, . . . ,Ak are columns of X 
corresponding to the main effects; (ii) V = {i'l, . . . ,"^^+1} satisfies conditions (lA.lOp . 
Actually, \X[ii, . . . , id+iW^ = \^[ii, ■ ■ ■ , "^d+i] P is guaranteed by the construction of I'. 
If I n I' 7^ 0, there exist two rows of I, for example ii,i2, such that the two rows have 
same entries at A3, . . . , A^ columns and different entries at Ai, A2 columns. Note that 
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the row pair (21,^2) corresponds to a unique subset J. There are 2^ — 1 possible J 
but only ^^ii^ll possible pairs. Since 'M^l < 2*^ — 1, there is at least one J such that 
there is no pair of rows corresponding to it. That is, I fl I' = for such a J. 



Exchange algorithm for real-valued allocations 

Lemma 9.6 Let g{z) = Az{e — z) + Bz + C{e — z) + D for nonnegative constants 
A, B, C, D, e. Define A - '^^+b-c 



2A 



(i) If < A < e, then ma,XQ<z<e g{z) = eC + D + at z = A. 

(a) If A < 0, then maxo<^<e = eC + D at z = 0. 
(in) If A > e, then maxo<^<e = eB + D at z = e. 



Lemma 9.7 Let p = (pi, . . . ,^2'=)'? /(p) = l-'^W-^l, and 

hiz) := /(pi,...,p,_i,2;,p,+i,...,pj_i,e-2,pj+i,...,p20 

= Az{e - z) + Bz + C{e - z) + D, 

where I <i < j and e = Pi+Pj. Then (i) D > =^ B > and C > 0; (ii) B > 
orC >0^ A>0; (m) /(p) > ^ A > 0; (iv) D = /(pi, . . . , 0,pi+i, . . . , 
Pj_i,0,pj+i, . . . , P2k); (v) Suppose e > 0, then A = ^ {2fij (|) - fij{0) - fij{e)), 
B = \{f,,{e)-D),C = \{f,,{Q)-D). 

Exchange algorithm for maximizing /(p) = /(pi, . . . ,^'2'=) = |XWX| 

1° Start with an arbitrary design p^^^ = {p^i \ ■ ■ ■ ,^2°^ such that /(p*-*^^) > 0. 
2° Set up a random order of (i, j) going through all pairs 

{(1, 2), (1,3),..., (1,2'=), (2, 3),..., (2^=- 1,2'=)}. 

3° For each (2, j), if e := ^pf^ — 0' P''"^^ — P''°^ ^^"^ jump to 5°. Otherwise, 
let 

fij {z) = f (pf^ , • • • , pf\ , z, pfl^ ,...,pfl^,e- z, pfl^ p'-^i^ ) 
= Az{e - z) + Bz + C{e- z) + D 

with nonnegative constants A, B, C, D determined by Lemma [9.71 
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4° Define p^^^ = (^pf\ pf\, z^.pf}^, pfl^, e - z^,pfl^, j where 

maximizes fij{z) with < z < e (see Lemma [9l6|) . Note that /(p^^^) = fij{z^) > 
/(pW) > 0. 

5° Repeat 2° ~ 4° until convergence (no more increase in terms of /(p) by any 
pairwise adjustment). 

Theorem 9.1 If the exchange algorithm converges, the converged maximizes \X'WX\. 

Proof of Theorem 19. ll Suppose the exchange algorithm converges at p* = {p\, . . . ,P2k)'- 
According to the algorithm, |XWX| > at p*. Without any loss of generality, as- 
sume > 0. Let p; = (pI, . . .,plk_i), IriPr) = logfriPr), and /r.(Pr-) = /(Pl, • • • , 

P2^~i, l-Et'p.)- Then for 2 = 1,..., 2'^-!, £ 

0; < 0, otherwise. Thus p* (or p*) locally maximizes /(p) (or /r(Pr)), and p* attains 
the global maximum of /(p) on S. □ 

Similar to the lift-one algorithm, we may modify the exchange algorithm so that p*^''-* 
won't be updated until all potential pairwise exchanges among pj's have been checked. 
It can be verified that the modified exchange algorithm must converge. 



1 



dfr 



0, ifp* > 
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